Section: New Results

Deep CNN for Modelling Deformations and Semantic Segmentation

Participant: Iasonas Kokkinos

Invariance to deformations in Deep Convolutional Neural Networks (DCNNs) is commonly achieved by using multiple ‘max-pooling’ (MP) layers. In [26] we show that alternative methods of modeling deformations can improve the accuracy and efficiency of DCNNs. To this end, (i) we introduce epitomic convolution as an alternative to the common convolution-MP cascade of DCNNs, (ii) we introduce a Multiple Instance Learning algorithm to accommodate global translation and scaling in image classification, and (iii) we develop a DCNN sliding window detector that explicitly, but efficiently, searches over the object’s position, scale, and aspect ratio. We provide competitive image classification and localization results on the ImageNet dataset and object detection results on Pascal VOC2007.
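As a rough illustration of the search space in (iii), the sketch below enumerates candidate windows over position, scale, and aspect ratio and keeps the best one under a caller-supplied scoring function. It is a plain exhaustive enumeration and does not reproduce the efficient search of [26]. The names `sliding_windows`, `best_window`, and `score_fn`, and all default values, are hypothetical.

```python
from itertools import product


def sliding_windows(img_w, img_h, base=64, scales=(1.0, 2.0),
                    aspect_ratios=(0.5, 1.0, 2.0), stride_frac=0.5):
    """Yield (x, y, w, h) boxes over position, scale and aspect ratio.

    All parameter names and defaults are illustrative assumptions.
    """
    for scale, ratio in product(scales, aspect_ratios):
        # Keep the box area near (base * scale)^2 while setting w / h = ratio.
        w = int(round(base * scale * ratio ** 0.5))
        h = int(round(base * scale / ratio ** 0.5))
        if w > img_w or h > img_h:
            continue  # this box shape does not fit inside the image
        step_x = max(1, int(w * stride_frac))
        step_y = max(1, int(h * stride_frac))
        for y in range(0, img_h - h + 1, step_y):
            for x in range(0, img_w - w + 1, step_x):
                yield (x, y, w, h)


def best_window(img_w, img_h, score_fn, **kwargs):
    """Return the highest-scoring box; score_fn stands in for a DCNN scorer."""
    return max(sliding_windows(img_w, img_h, **kwargs), key=score_fn)
```

In practice `score_fn` would be replaced by the classifier response for the cropped window. Here any function mapping a box to a number will do, for example `best_window(128, 128, lambda b: b[2] * b[3])`.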

In [25] we bring together methods from DCNNs and probabilistic graphical models to address the task of pixel-level classification (“semantic image segmentation”). We overcome the poor localization property of deep networks by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF). Qualitatively, our “DeepLab” system localizes segment boundaries more accurately than previous methods.
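To make the combination of per-pixel DCNN responses with a fully connected CRF more concrete, the sketch below runs naive mean-field inference on a toy problem. It is a sketch under assumptions, not the implementation of [25]: it assumes a Potts label compatibility and a single Gaussian kernel over pixel position and intensity, computes all pairwise terms directly in O(N^2) time, and uses hypothetical names and parameter values throughout.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense_crf_mean_field(unary_probs, positions, intensities, weight=2.0,
                         sigma_pos=3.0, sigma_int=0.2, n_iters=5):
    """Naive mean-field inference for a fully connected Potts CRF.

    unary_probs[i][l] is the class probability of pixel i for label l,
    e.g. a per-pixel DCNN output. Kernel form and defaults are assumptions.
    """
    n = len(unary_probs)
    n_labels = len(unary_probs[0])
    # Unary term: log-probabilities (clamped to avoid log(0)).
    log_p = [[math.log(max(p, 1e-9)) for p in row] for row in unary_probs]

    def kernel(i, j):
        # Gaussian affinity: large for nearby pixels of similar intensity.
        d_pos = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
        d_int = (intensities[i] - intensities[j]) ** 2
        return math.exp(-d_pos / (2 * sigma_pos ** 2)
                        - d_int / (2 * sigma_int ** 2))

    # Every pixel is connected to every other pixel (no self-connection).
    k = [[kernel(i, j) if i != j else 0.0 for j in range(n)]
         for i in range(n)]

    q = [row[:] for row in unary_probs]  # initialise beliefs from the unaries
    for _ in range(n_iters):
        new_q = []
        for i in range(n):
            scores = []
            for l in range(n_labels):
                # Under a Potts model, support for label l is the
                # kernel-weighted belief of all other pixels in l.
                support = sum(k[i][j] * q[j][l] for j in range(n))
                scores.append(log_p[i][l] + weight * support)
            new_q.append(softmax(scores))
        q = new_q  # parallel update of all pixels
    return q
```

On a six-pixel strip whose left half is dark and right half is bright, a weakly mislabelled pixel inside the dark half is pulled back to the label of its similar neighbours, while the boundary between the halves is preserved because the kernel is close to zero across the intensity step.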